Speech/non-speech classification using multiple features for robust endpoint detection
نویسندگان
چکیده
In this paper, we describe a new speech/non-speech classification method that improves the endpoint detection performance for speech recognition in noisy environments. The proposed method uses multiple features to increase the robustness in noisy environments, and the classification and regression tree(CART) technique is applied to effectively combine these multiple features for classification of each frame. We evaluate the performance of the proposed method by conducting speech/non-speech classification experiments on noisy speech. We also investigate the importance of various features on speech/non-speech classification in noisy environments In particular, the proposed method is applies to the endpoint detection algorithm for isolated speech recognition of voicedialing cellular phone. We simulate the speech recognition experiments in various noise environments, and the effects of proposed method on speech recognition performance are evaluated.
منابع مشابه
Phoneme Classification Using Temporal Tracking of Speech Clusters in Spectro-temporal Domain
This article presents a new feature extraction technique based on the temporal tracking of clusters in spectro-temporal features space. In the proposed method, auditory cortical outputs were clustered. The attributes of speech clusters were extracted as secondary features. However, the shape and position of speech clusters change during the time. The clusters temporally tracked and temporal tra...
متن کاملImproving of Feature Selection in Speech Emotion Recognition Based-on Hybrid Evolutionary Algorithms
One of the important issues in speech emotion recognizing is selecting of appropriate feature sets in order to improve the detection rate and classification accuracy. In last studies researchers tried to select the appropriate features for classification by using the selecting and reducing the space of features methods, such as the Fisher and PCA. In this research, a hybrid evolutionary algorit...
متن کاملClassification of emotional speech using spectral pattern features
Speech Emotion Recognition (SER) is a new and challenging research area with a wide range of applications in man-machine interactions. The aim of a SER system is to recognize human emotion by analyzing the acoustics of speech sound. In this study, we propose Spectral Pattern features (SPs) and Harmonic Energy features (HEs) for emotion recognition. These features extracted from the spectrogram ...
متن کاملAn Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition
Convolutional Neural Networks (CNNs) have been shown their performance in speech recognition systems for extracting features, and also acoustic modeling. In addition, CNNs have been used for robust speech recognition and competitive results have been reported. Convolutive Bottleneck Network (CBN) is a kind of CNNs which has a bottleneck layer among its fully connected layers. The bottleneck fea...
متن کاملRobust entropy-based endpoint detection for speech recognition in noisy environments
This paper presents an entropy-based algorithm for accurate and robust endpoint detection for speech recognition under noisy environments. Instead of using the conventional energy-based features, the spectral entropy is developed to identify the speech segments accurately. Experimental results show that this algorithm outperforms the energy-based algorithms in both detection accuracy and recogn...
متن کامل